The Communication Performance of the Cray T3D and its Effect on Iterative Solvers

Authors

Y. F. Hu

D. R. Emerson

R. J. Blake

Abstract

On many distributed memory systems, such as workstation clusters or the Intel iPSC/860, the multigrid algorithm suffers from extensive communication requirements and is, in general, not very competitive with the conjugate gradient algorithm. This contrasts with the sequential case, where the multigrid algorithm is very effective in reducing the global residual, particularly for very large linear systems of equations. These two algorithms are now compared on the Cray T3D for solving very large systems of linear equations (resulting from grids of the order of 256^3 cells). The communication performance of the Cray T3D is first measured by the standard ping-pong tests and also by practical communication tasks that are found frequently in CFD calculations. It is found that the Cray T3D has a low latency (≈ 6 μs) and a high interprocessor communication bandwidth (120 MB/s) when the low-level intrinsic communication routines are used. As a result, the multigrid algorithm is found to be very competitive when compared with the conjugate gradient algorithm for solving the very large linear systems arising from the Direct Numerical Simulation of turbulent Combustion (DNSC). Results are contrasted with those on the Intel iPSC/860.
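To give a feel for what the quoted latency and bandwidth figures imply, here is a minimal sketch in Python. It assumes the common linear cost model for a point-to-point message, time = latency + size / bandwidth. That model is an assumption for illustration and is not stated in the abstract. The function names are hypothetical, and 1 MB is taken to be 10^6 bytes.

```python
# Illustrative linear cost model for one point-to-point message.
# ASSUMPTION: time = latency + nbytes / bandwidth (not taken from the paper).
# The two constants are the figures quoted in the abstract for the Cray T3D.

LATENCY_S = 6e-6        # about 6 microseconds
BANDWIDTH_BPS = 120e6   # 120 MB/s, assuming 1 MB = 10**6 bytes


def transfer_time(nbytes, latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Modelled time in seconds to send a message of nbytes bytes."""
    return latency + nbytes / bandwidth


def effective_bandwidth(nbytes, latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Achieved bytes per second for a message of nbytes bytes."""
    return nbytes / transfer_time(nbytes, latency, bandwidth)


def half_performance_length(latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Message size in bytes at which half the peak bandwidth is achieved."""
    return latency * bandwidth


if __name__ == "__main__":
    for nbytes in (8, 1024, 65536, 1048576):
        mb_per_s = effective_bandwidth(nbytes) / 1e6
        print(f"{nbytes:>8} bytes: {mb_per_s:7.2f} MB/s effective")
    print(f"half-performance length: {half_performance_length():.0f} bytes")
```

Under this model, short messages are dominated by latency, which is why an algorithm such as multigrid, which exchanges many small messages on its coarse grids, is sensitive to the latency figure in particular.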

Similar resources

MPP Solution of Rayleigh-Bénard-Marangoni Flows

A domain decomposition strategy and parallel gradient-type iterative solution scheme have been developed and implemented for computation of complex 3D viscous flow problems involving heat transfer and surface tension effects. Special attention has been paid to the kernels for the computationally intensive matrix-vector products and dot products, to memory management, and to overlapping communic...

Parallel Implementation of a 3-d Subband Decomposition Algorithm for Digital Image Sequence Compression on the Cray T3d

This paper presents an efficient massively parallel implementation on the CRAY T3D of a digital image sequence compression scheme based on a 3-D subband decomposition. This compression method has been selected to be implemented on the CRAY T3D for its high potential of parallelization, its high computational complexity and its scientific interest. This implementation has been performed in C, using...

Parallel Iterative Solvers and Preconditioners Using Approximate Hierarchical Methods ( An Extended

In this paper, we report results of the performance, convergence, and accuracy of a parallel GMRES solver for Boundary Element Methods. The solver uses a hierarchical approximate matrix-vector product based on a hybrid Barnes-Hut / Fast Multipole Method. We study the impact of various accuracy parameters on the convergence and show that with minimal loss in accuracy, our solver yields significa...

CellFlow: A Parallel Rendering Scheme for Distributed Memory Architectures

CellFlow is an animation system that exploits frame coherency to implement a lookahead scheme of object dataflow. The implementation of this scheme uses the communication features of modern scalable multicomputers to achieve good speedup by means of latency hiding. We demonstrate the performance of our approach in the field of volume rendering by implementing incremental rotation of the volumet...

A Cray T3D Performance Study

We carry out a performance study using the Cray T3D parallel supercomputer to illustrate some important features of this machine. Timing experiments show the speed of various basic operations while more complicated operations give some measure of its parallel performance.

Journal title:

Parallel Computing

Volume 22

Publication date: 1996
